survey.

3. Continue sampling until you reach the sample size you need (or the time window expires).

Chapter 4 describes the software G*Power that can be used for making sample-size calculations.

In systematic sampling, you start at a randomly chosen individual and then select

every kth member of the population, where k is the sampling interval you selected.

Systematic sampling does not produce a representative sample if there are any time-related cyclic patterns that

could impose periodicity on the underlying data. For example, suppose that it was known that

most pediatric patients present to the emergency department between 6 p.m. and 8 p.m. If you

chose to collect data only during this time window, then even with systematic sampling you would

almost certainly oversample pediatric patients.
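
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name systematic_sample, the arrival log of patient IDs, and the 70-day calendar are all hypothetical and do not come from any study. The second call shows one way periodicity can distort a sample: when k matches the length of a cycle in the data (here, a 7-day week), every selected unit falls at the same point in the cycle.

```python
import random


def systematic_sample(population, k, seed=None):
    """Return every k-th member of `population`, starting at a random
    position within the first k members."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting individual (index 0..k-1)
    return list(population[start::k])


# Hypothetical arrival log: patient IDs 1..100, in order of arrival.
arrivals = list(range(1, 101))
sample = systematic_sample(arrivals, k=10, seed=42)

# Hypothetical 70-day calendar where day d falls on weekday d % 7.
# With k=7, every sampled day lands on the same weekday, so any
# weekly pattern in the data is carried straight into the sample.
days = list(range(70))
weekly_sample = systematic_sample(days, k=7, seed=0)
weekdays_hit = {d % 7 for d in weekly_sample}
```

Choosing a k that does not line up with a known cycle, or sampling across the whole period of interest rather than a single time window, is one way to reduce this risk.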

Sampling clusters

Another sampling challenge you may face as a biostatistician arises

when you are studying an environmental exposure. The term exposure is from epidemiology

and refers to a factor hypothesized to have a causal impact on an outcome (typically a health

condition). Examples of environmental exposures that are commonly studied include air pollution

emitted from factories, high levels of contaminants in an urban water system, and environmental

pollution and other dangers resulting from a particular event (such as a natural disaster).

Consider the scenario where parents in a community are complaining that a local factory is emitting

pollutants that they believe are resulting in a higher rate of leukemia being diagnosed in the community’s

youth. To study whether the parents are correct or not, you need to sample members of the population

based on their proximity to the factory. This is where cluster sampling comes in.

Planning to do cluster sampling geographically starts with getting an accurate map of the area from

which you are sampling. In the United States, each state is divided up into counties, and each county is

further subdivided into smaller regions determined by the U.S. census. Other countries divide their

maps along official geographic boundaries in similar ways. In the scenario described, where

a factory is thought to be polluting, the factory could be placed on the map and lines drawn around the

locations from which a sample should be drawn. Different methodologies are used depending upon the

specific study, but they usually involve taking an SRS of regions and then, from the sampled regions (known

as clusters), taking an SRS of community members for study participation.
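
The two-stage approach just described (an SRS of regions, then an SRS of community members within each sampled region) might be sketched as follows. This is a minimal illustration, not a prescribed methodology: the function name two_stage_cluster_sample, the tract labels, and the resident IDs are hypothetical, and a real study would build its sampling frame from actual census geography.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Two-stage sample from `clusters`, a dict mapping a cluster ID
    to the list of members in that cluster."""
    rng = random.Random(seed)
    # Stage 1: SRS of clusters (sorted first so results are reproducible).
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for cluster_id in chosen:
        members = clusters[cluster_id]
        # Stage 2: SRS of members within the sampled cluster; take
        # everyone if the cluster is smaller than the requested size.
        take = min(n_per_cluster, len(members))
        result[cluster_id] = rng.sample(members, take)
    return result


# Hypothetical sampling frame: 8 regions with 50 residents each.
tracts = {
    f"tract_{i}": [f"resident_{i}_{j}" for j in range(50)]
    for i in range(8)
}
cluster_sample = two_stage_cluster_sample(
    tracts, n_clusters=3, n_per_cluster=5, seed=1
)
```

The result maps each sampled region to the residents drawn from it, which keeps the cluster membership of every participant available for later analysis.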

But cluster sampling is not only done geographically. As another example, clusters of schools

may be selected based on school district, rather than geography, and an SRS drawn from each

school. The important takeaway from cluster sampling is that it is a sampling strategy optimized

for drawing a representative sample when studying an exposure known to be uneven across the